Geographic factors and climatic fluctuation drive the genetic structure and demographic history of Cycas taiwaniana (Cycadaceae), an endemic endangered species to Hainan Island in China

Abstract Hainan Island had experienced several cold–warm and dry–humid fluctuations since the Late Pleistocene period, resulting in separating and connecting from the mainland several times with the cyclic rise and fall of sea level. The fluctuations can change the biota and ecological environment in the island. Cycas taiwaniana Carruthers is endemic to Hainan Island and is classified as endangered by the International Union for Conservation of Nature (IUCN). To comprehensively understand the genetic dynamics of C. taiwaniana, we sampled 12 wild populations in Hainan Island and one cultivated population in Fujian province, and analyzed the genetic diversity, genetic structure, and demographic history based on the molecular data. Results revealed that C. taiwaniana had relatively low genetic diversity and high genetic differentiation. Haplotypes of C. taiwaniana diversified during the Pleistocene based on the chloroplast DNA (cpDNA) and the concatenated nuclear DNA (nDNA) data. Genetic cluster analyses based on the microsatellite (SSR) data showed that the 12 wild populations were separated into three clusters which could be three evolutionary significant units (ESUs), indicating three basic units of protection were identified. Moreover, we also confirmed the cultivated population FJ derived from the DLS1‐GSL clade. Demographic inference from different data was discordant, but overall, it uncovered that C. taiwaniana had experienced population contraction events twice during the Pleistocene and Holocene, and then expanded recently. Our study elucidated the population genetic characteristics of C. taiwaniana, and guided us to develop targeted conservation and management strategies for this endangered species.


| INTRODUC TI ON
Conservation genetics plays an important role in the conservation of rare and endangered species. At a time of rapid loss of biodiversity, it is more necessary to study population genetics and historical dynamics of endangered species (Liu & Zhao, 1999;Miao et al., 2021).
The application value of population genetics lies in the ability to find the key factors which lead to the loss of rare and endangered species. It provides the maximum and accurate conservation strategy by using DNA data to study the genetic structure and the factors causing the genetic change in populations. Population genetics is becoming more and more popular in animals and plants conservation (Qiao et al., 2021;Sanal Demirci et al., 2021;Shan & Diao, 2011;Wang et al., 2021) and is also widely used in cycads Wang et al., 2019;Xiao et al., 2020).
Cycads are one of the oldest and the most ancestral living seed plant which belongs to gymnosperms. They flourished from late Triassic to early Cretaceous of Mesozoic era, and declined gradually until late Cretaceous (Gao & Thomas, 1989). An earlier study showed that living cycad species originated in the past 12 million years (Nagalingum et al., 2011). However, a recent study has shown that cycads are not so young and the mean crown age of extant Cycadaceae is about 69-43 million years . The newest list of cycads consists of 10 genera and 367 species, and the genus Cycas is the largest genus of the extant cycads owning about 118 species (Calonje et al., 2022). According to the latest records, there are about 20 Cycas species in China (Xi et al., 2022), which are all listed in The List of First-level Protected Plants of China. A total of 99 Cycas species were included in the IUCN Red List of threatened species and about 81% of Cycas species were endangered and needed us to protect them (IUCN, 2022).
Cycas taiwaniana was named in 1983 by Willian Carruthers who thought it came from Taiwan based on the words that Hanke had written on the specimen. This species has long been cultivated in China (Guangdong and Fujian Provinces), most of them are from cultivated plants whereas are extremely rare in the wild (Hill, 2008). For a long time, the origin of C. taiwaniana has been unknown, and it is generally treated as a C. taiwaniana complex.
Fortunately, a new study treated C. changjiangensis, C. hainanensis, and C. lingshuigensis as synonyms for C. taiwaniana by using molecular data and preliminary morphological inference in species delimitation analysis, and verified the origin of C. taiwaniana is Hainan Island (Feng et al., 2021).
Hainan Island is considered the second largest island in China.
Due to its unique tectonic position which is on the northern edge of the tropics, tropical monsoon climate and rich biological species, it is recognized as a hotspot of biodiversity conservation in the world (Cai et al., 2021). The flora of Hainan Island is dominated by tropical families, genera and species. From the perspective of biogeography, it is proposed that Hainan Island was connected to Vietnam and the Guangxi province of China during the Eocene, and then moved and rotated to the southeast and finally reached the present position (Zhu, 2020). Due to the low latitude of Hainan Island, some ancient plants and animals can be preserved. Some researchers found Cycadaceae palynological fossils of the late Pleistocene strata in Hainan Island (Yan, 2006). Moreover, a biogeographical analysis also showed that the extant Cycas species were of South China origin (Xiao & Moller, 2015). Since the Quaternary, Hainan Island may have been connected to Guangdong Leizhou Peninsula several times due to the rise and fall of sea level, which lead to the biota complicated in Hainan Island (Zhu, 2020). Cycas taiwaniana was endemic to Hainan Island and distributed mainly in Limuling Mountain and Wuzhi Mountain.
In this study, we used four chloroplast DNA (cpDNA) intergenic spacers, four nuclear genes and 10 microsatellites to perform comprehensive population genetics analysis of C. taiwaniana. We aim to: (i) evaluate the genetic diversity, genetic differentiation and genetic clusters of C. taiwaniana, (ii) determine the phylogenetic divergence time estimation, (iii) explore the demographic history of C. taiwaniana by combining with the geological history of Hainan Island, and (iv) provide the basic and practical guidelines for its conservation strategies.

| Plant materials
A total of 13 populations of C. taiwaniana were collected by field investigation. Among them, the population FJ was a cultivated population. After collection, fresh and healthy leaves were immediately dried in silica gel. Ten individuals from each population were randomly selected for DNA sequencing, 20 individuals were selected for microsatellite analysis, and the populations with fewer than 10 or 20 individuals were all sampled. In total, we used 128 and 188 individuals for DNA sequencing and microsatellite genotyping, respectively. Sampling detail information and distribution were summarized in Table S1 and Figure 1. Distribution maps were drawn using the ArcGIS v.10.2 (http://deskt op.arcgis.com).
Polymerase chain reaction (PCR) amplification procedures and PCR amplification system were the same as those used in Feng et al. (2014) and Feng, Liu, and Gong (2016). DNA sequencing was carried out by using an ABI 3770 automated sequencer, and PCR products of the targeted microsatellite loci were separated and visualized using an ABI 3730XL automated sequencer at Kunming Branch of Tsingke Biotechnology Co., Ltd. Microsatellite profiles were read with GeneMapper v.4.0 software (Applied Biosystems).
All sequences were deposited in GenBank with the accession numbers (Table S3).

| Data analysis
2.3.1 | Data analysis of DNA sequences DNA sequences were edited and assembled using the SeqMan (Swindell & Plasterer, 1997). Multiple alignments of DNA sequences were performed in Bioedit v7.0.4.1 (Hall, 1999). We combined the four cpDNA fragments and used the combined cpDNA sequence in the subsequent analyses.
We applied the secondary calibration point (27.9193 Mya) to estimate the divergence time of haplotypes for cpDNA and the concatenated nDNA with C. szechuanensis as the outgroup . Nucleotide substitution models were tested in PhyloSuite v1.2.2 (Lanfear et al., 2017;Zhang et al., 2020). The divergence times were estimated in BEAST v1.6.1 (Drummond & Rambaut, 2007). The HKY with a relaxed lognormal molecular clock model and the GTR + G4 with a relaxed lognormal molecular clock model were used for cpDNA and nDNA, respectively. A normal prior with a mean value at 27.9193 and a stdev value at 0.5 for the Yule birth rate were both used in cpDNA and nDNA.
The mutation rate posterior estimates and the time of divergence were obtained by Markov Chain Monte Carlo (MCMC) analysis.
Log parameters were sampled every 10,000 iterations under 100,000,000 generations. The analysis ran three times under the same condition and all parameters were stabilized. All the effective sample size parameters were required to exceed 200 checked in TRACER v1.5 (Rambaut & Drummond, 2009). The log and tree files were combined with LogCombiner v1.6.1 and the consensus tree was generated with TreeAnnotator v1.6.1 with the first 25% burn-in. All the results were checked in the Figtree v.1.4.3 (Rambaut et al., 2016).
To gain more comprehensive inferring and understanding of the historical demography of C. taiwaniana, we analyzed the Bayesian Skyline Plot (BSP) of cpDNA data and the four nuclear genes respectively and the Extended Bayesian Skyline Plot (EBSP) of all genes with unlinked model in the BEAST program (Heled & Drummond, 2008). We also used the DnaSP v5.0 software to investigate population dynamics of the species by performing a pairwise F I G U R E 1 The morphological characteristics and sampling detail distribution of Cycas taiwaniana mismatch distribution analysis and neutrality tests which included Tajima's D, Fu and Li′s D* and F* and Fu's F S (Fu, 1997). The sumof-squared deviations (SSD) and raggedness index as well as their p values were calculated in Arlequin v3.11. in GenAlEx 6.51b2 (Peakall & Smouse, 2012). Allelic richness (A R ), total genetic diversity for species (H T ), and coefficient of gene differentiation (G ST ) were estimated in FSTAT v1.2 (Goudet, 1995).

| Data analysis of microsatellites
Using the software Arlequin v3.11 to conduct AMOVA analysis. A Bayesian analysis of population structure was conducted with STRUCTURE v2.3.4 (Pritchard et al., 2000). The analysis was conducted under the admixture model and correlated allele frequencies, with the number of genetic clusters (K) set to (K = 1-20). Each K was run with 20 independent iterations and each run included a burn in of 100,000 iterations and 100,000 subsequent MCMC steps.
The effective population sizes (Ne) of each population were tested under random mating system in LDNe (Waples & Do, 2008).
To ensure the credibility of the analysis results, we chose the lowest allele frequency at 0.05 level with a 95% confidence interval.
We detected the bottleneck effect at the population level to explore population demography using two methods (Sign tests and Wilcoxon tests) under the two models: stepwise mutation model (SMM) and two-phased model (TPM). A mode shift model was also used to test for bottlenecks in each population in BOTTLENECK v1.2.02 (Cornuet & Luikart, 1996;Piry et al., 1999). Parameters for TPM were set up as Variance = 30.00, Probability = 90.00% and estimation based on 10,000 replications. We further investigated a genetic bottleneck using the Garza-Williamson index (Garza & Williamson, 2001) which was calculated in Arlequin v3.11.

TA B L E 1
Haplotype diversity (Hd) and nucleotide diversity (Pi) estimated from cpDNA and nuclear genes The alignment length of PHYP was 932 bp with 11 polymorphic sites, detecting 12 haplotypes (taiP1-taiP12). The Pi and the Hd of C. taiwaniana were 0.00217 and 0.735, respectively ( Table 1). The alignment length of PPRC was 710 bp with 19 polymorphic sites, and 11 haplotypes (taiR1-taiR11) were identified. The Pi and the Hd of C. taiwaniana based on the nuclear gene PPRC were 0.00188 and 0.617, respectively ( Table 1). The alignment length of AAT was 538 bp with 29 polymorphic sites, detecting 43 haplotypes (taiT1-taiT43). The Pi and the Hd of C. taiwaniana based on the nuclear gene AAT were 0.00864 and 0.896, respectively ( Table 1).
The A total of 120 alleles were identified at the 10 microsatellite loci (Table S5). Diversity estimates varied in different populations (Table 2). There was no private allele in populations DLS2, BLS and NWH, but the population DL had the most private alleles (N P = 5).
Allelic richness was lowest in population FJ (A R = 2.200) and highest in population DL (A R = 5.171). tion between populations based on the cpDNA data was significant.
Therefore, there was almost no gene flow between populations.
Whereas the genetic differentiation between populations was smaller and the gene flow between populations was larger based on nuclear data. Gene flows between each pair of the 13 populations based on the SSR data were shown in Table S6. Population DLS2 had TA B L E 2 Genetic diversity within populations of Cycas taiwaniana based on the microsatellites data   (Table 2).  (Table S7). However, the rest two nuclear genes and cpDNA showed C. taiwaniana had no distinct phylogeographical structure.  Table S1.

| Network of haplotypes and divergence time estimation
Genealogies reflecting haplotypes topology and frequency were constructed based on cpDNA and four nuclear genes (AC5, PHYP, PPRC, and AAT; Figures 2 and 3). In the cpDNA haplotype network, the haplotype taiH1 was located at the internal node and appeared the most frequently. Based on the nuclear genes data, the highest frequency haplotype was always distributed in each population.
For the nuclear gene AC5, the highest frequency haplotypes were

| Genetic cluster and Barrier analysis
The STRUCTURE analysis showed that the optimal K value was

| Population history dynamics
Values for neutrality test and mismatch distribution analysis were listed in Table S8 and Figure 7. Fu's Fs was more sensitive to population expansion than the other three neutrality test parameters.
Based on nuclear genes AC5 and AAT, Fu's Fs were significant negative values, SSD and raggedness index were not significant, and the mismatch distribution analyses showed a unimodal curve, indicating that C. taiwaniana experienced an expansion event (Figure 7). The bottleneck analyses were used to calculate heterozygosity excess test and mutation-drift equilibrium as estimated with different models and different methods (

| Low genetic diversity and genetic differentiation
Genetic diversity, species diversity, and ecosystem diversity are three main pillars of biodiversity. On the basis of principle of population genetics, the conservation of biodiversity is ultimately the conservation of genetic diversity (Liu & Zhao, 1999). The early evaluation on Based on cpDNA data, the genetic diversity of C. taiwaniana was generally low, whose Pi was lower than C. segmentifida, C. chenii,

F I G U R E 5 (a, b) Bayesian inference using STRUCTURE of Cycas taiwaniana; (c) Principal coordinate analysis of SSR data from 13
populations of 188 individuals of C. taiwaniana.

F I G U R E 6
The boundaries detected using the Barrier program based on matrices of Nei and Chesser (1983) unbiased genetic distance.
C. debaoensis, C. bifida, and C. diannanensis, but the Hd was higher than those of the above five species. The nucleotide diversity and haplotype diversity of C. taiwaniana was lower than C. simplicipinna, C. panzhihuaensis, C. multipinnata, C. dolichophylla, and C. guizhouensis (Table 5; Zheng et al., 2017).
Due to the different evolutionary history of nuclear genes,

F I G U R E 8 The Extend Bayesian
Skyline Plot based on cpDNA and the four nuclear genes for the estimate of fluctuations in effective population size over time. Black line: median estimation; area between gray lines: 95% confidence interval.
The AMOVA results were consistent with those of other studies, which showed that the genetic differentiations of many Cycas species were distributed among populations based on the cpDNA data (Feng et al., 2014Feng, Zheng, & Gong, 2016;Wang et al., 2019;Xiao et al., 2020;Yang et al., 2017;Zheng et al., 2016). Moreover, nuclear genes explored the opposite results (Feng et al., 2014;Feng, Zheng, & Gong, 2016;Wang et al., 2019;Xiao et al., 2020;Yang et al., 2017), genetic variations were mainly distributed within populations with the species C. dolichophylla as an exception, whose genetic variations among populations was greater than that within populations . In Cycas, the chloroplast DNAs are maternal inheritance (Zhong et al., 2011), and genetic materials are transmitted by seeds. While, the nuclear genes are biparental inheritance, and genetic materials are transmitted by both seeds and pollens. The seeds of Cycas often fall near the mother plant, resulting in increased inbreeding, which increased homogenization within the population and heterogeneity among populations. Moreover, the evolution rate of nuclear genes was faster than that of chloroplast and mitochondrial, nuclear genes can accumulate more variations.
Therefore, cpDNA data revealed that more genetic variations were distributed among populations.

| Significant genetic structure and the origin of cultivated population
Previous studies had shown that there were low levels of gene flow among populations in most cycads, however, the gene flow among some populations of C. taiwaniana was larger in our study (Aristizábal et al., 2017). The seeds of Cycas species were so heavy that they could only be disperse closely by rodents or gravity. One study had shown that the gene flow of Cycas was 2-7 km (Yang & Meerow, 1996), leading to introgression and enhancing the probability of inbreeding (Hall & Walter, 2011). According to the biological   (Table S6). Hainan island is high in the middle and low in the periphery, with Wuzhishan and Yinggeling as the uplift core . However, when the population size remained stable, the mismatch distribution analysis showed a multimodal curve distribution, and the neutrality test was not significant. Actually, demographic histories inferred are more complex than the parametric models involved in these approaches. The BSP method is based on coalescent theory and it is more reflective of the population historical dynamics.
Base on the BSP, our finding showed that the population historical dynamics of C. taiwaniana were consistent with the seven Cycas species (C. debaoensis, C. simplicipinna, C. bifida, C. multipinnata, C. diannanensis, C. segmentifida, and C. panzhihuaensis) which had experienced population contractions based on cpDNA data (Feng et al., 2014Gong, 2015;Gong et al., 2015;Liu et al., 2015;Xiao et al., 2020;Zhan et al., 2011). In contrast, the population history dynamics revealed by different nuclear genes were more complex. For nuclear gene PHYP, most species had experienced population expansion in history (C. taiwaniana, C. guizhouensis, and C. chenii;Feng, Zheng, & Gong, 2016;Yang et al., 2017). For nuclear gene PPRC, C. taiwaniana had experienced population expansion, but C. dolichophylla had experienced population contractions . Kya, and there were relatively periodic fluctuations of cold-warm, dry-humid (Yan, 2006). According to the latest age of the basalt in Haikou Geopark, the last basalt eruption was about 8 Kya years ago (Liang, 2018), so it was speculated that the volcanic ash might have fallen on Hainan Island to form land. After the last glacial period of late Quaternary and in the early Holocene, the climate of Hainan Island was getting warmer, rainfall increased, and plant growth flourished, C. taiwaniana was no exception in increasing its population size until now (Jin et al., 2003(Jin et al., , 2008Wang et al., 2018;Yan, 2006).

| Future protection strategy
At present, lots of lives on earth are facing the sixth mass extinction, which is caused by human activities, climate change, and ecological collapse (Teixeira & Huber, 2021). Whether based on DNA data or RAD-seq data, the genetic diversity of C. taiwaniana was relatively low (Tao et al., 2021). Two recent studies on the community structure of C. taiwaniana in a certain area revealed that it was a stable population with low growth, but high forest canopy density, habitat destruction, human disturbance and illegal trade might be the reason why C. taiwaniana was endangered Xie et al., 2019).
Population geneticists usually use empirical assessments of genetic diversity to support one of the core objectives of conservation genetics which is to maintain genetic diversity among individuals in ways that support the sustainability of populations and species, even in the face of continuing threats such as global climate change and fragmentation (Crandall et al., 2000;Moritz, 2002). Among the 12 wild populations of C. taiwaniana, the genetic diversity of population DL was the highest and the population NWH had relatively high genetic differentiation with other populations, both of which should be given priority protection. To prevent the loss of genetic diversity or unique genotype, we can use asexual reproduction techniques to maintain genetic diversity within populations, such as bud suctioning.
Effective population size is also an important parameter to evaluate the endangered status of a population or species. In C. taiwaniana, the effective population size of populations DLS1, DLS2, and GSL were more than 200 and the other populations were less than 100. The heterozyosity of small populations declined more rapidly than that of large populations, so these populations (BLS, WX, DL, and NBS) might face a high risk of losing partial genetic variation.
In addition, 12 wild populations of C. taiwaniana were divided into Inbreeding can increase homozygous, leading to the reduction of genetic diversity in small populations, which is extremely dangerous for small groups (Liu & Zhao, 1999). Therefore, combined with results of the estimation of effective population size and the division of significant evolutionary units, appropriate artificial pollination can be implemented in the population BLS and ESU2 to prevent and/or reduce inbreeding. In addition, the genetic diversity of the cultivated population FJ is lower than wild populations, indicating inefficient ex situ conservation. In order to preserve more genetic information, we can select as many individuals (collecting seeds) with high genetic diversity in each population as possible for ex situ protection in the future. Compared with genomic data, the limitations of this study are in existence, and in the future, genomic data will be needed to explore population genetics of more endangered species for formulating more appropriate conservation strategies.

ACK N OWLED G M ENTS
This research was funded by the Natural Science Foundation of Yunnan Province (202101AU070070), the National Natural Science Funds of China (31800191) and CAS "Light of West China" Program to Xiu-Yan Feng.

CO N FLI C T O F I NTE R E S T
The authors declare no conflicts of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
DNA sequences: GenBank accessions in Table S3. The data that supports the findings of this study are available in the supplementary material of this article.